Latent Variable Graphical Model Selection via Convex Optimization, by Venkat Chandrasekaran et al.
Abstract
Suppose we observe samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of latent components, and to learn a statistical model over the entire collection of variables? We address this question in the setting in which the latent and observed variables are jointly Gaussian, with the conditional statistics of the observed variables conditioned on the latent variables being specified by a graphical model. As a first step we give natural conditions under which such latent-variable Gaussian graphical models are identifiable given marginal statistics of only the observed variables. Essentially, these conditions require that the conditional graphical model among the observed variables is sparse, while the effect of the latent variables is “spread out” over most of the observed variables. Next we propose a tractable convex program based on regularized maximum-likelihood for model selection in this latent-variable setting; the regularizer uses both the ℓ1 norm and the nuclear norm. Our modeling framework can be viewed as a combination of dimensionality reduction (to identify latent variables) and graphical modeling (to capture remaining statistical structure not attributable to the latent variables), and it consistently estimates both the number of latent components and the conditional graphical model structure among the observed variables. These results are applicable in the high-dimensional setting in which the number of latent/observed variables grows with the number of samples of the observed variables. The geometric properties of the algebraic varieties of sparse matrices and of low-rank matrices play an important role in our analysis.
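The convex program described above penalizes the Gaussian negative log-likelihood of the observed precision matrix K = S − L with an ℓ1 norm on the sparse component S (the conditional graphical model) and a nuclear norm on the PSD low-rank component L (the latent-variable effect). A minimal proximal-gradient sketch of that objective follows; this is illustrative only, not the authors' algorithm, and the penalty weights, step size, and iteration count are assumptions chosen for a toy problem:

```python
import numpy as np

def soft_threshold(X, t):
    # Elementwise soft-thresholding: prox of t * ||.||_1.
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def prox_psd_nuclear(X, t):
    # Prox of t * ||.||_* restricted to PSD matrices:
    # shrink eigenvalues by t and clip them at zero.
    w, V = np.linalg.eigh((X + X.T) / 2)
    w = np.maximum(w - t, 0.0)
    return (V * w) @ V.T

def lvglasso(Sigma_hat, alpha=0.2, beta=0.2, step=0.1, iters=500):
    """Proximal-gradient sketch of the regularized ML program

        min_{S, L}  -log det(S - L) + tr((S - L) Sigma_hat)
                    + alpha * ||S||_1 + beta * ||L||_*
        s.t.  L PSD,  S - L positive definite.

    Parameter values are illustrative, not taken from the paper."""
    p = Sigma_hat.shape[0]
    S = np.eye(p)          # sparse conditional precision of observed variables
    L = np.zeros((p, p))   # low-rank effect of the latent variables
    for _ in range(iters):
        Kinv = np.linalg.inv(S - L)
        grad = Sigma_hat - Kinv  # gradient of the smooth part w.r.t. S
        S_new = soft_threshold(S - step * grad, step * alpha)
        # Gradient w.r.t. L has the opposite sign, hence the "+" below.
        L_new = prox_psd_nuclear(L + step * grad, step * beta)
        # Keep S - L positive definite: backtrack the step if needed.
        if np.linalg.eigvalsh(S_new - L_new).min() <= 1e-8:
            step *= 0.5
            continue
        S, L = S_new, L_new
    return S, L
```

On a sample covariance with no latent structure (e.g. the identity), the estimate drives L to zero and S to a shrunken diagonal precision matrix, as the penalties intend.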
Similar resources
Rejoinder: Latent Variable Graphical Model Selection via Convex Optimization, by Venkat Chandrasekaran et al.
Latent Variable Graphical Model Selection via Convex Optimization – Supplementary
1. Matrix perturbation bounds. Given a low-rank matrix, we consider what happens to its invariant subspaces when the matrix is perturbed by a small amount. We assume without loss of generality that the matrix under consideration is square and symmetric; our methods can be extended to the general non-symmetric, non-square case. We refer the interested reader to [1, 3] for more details, as the ...
Rejoinder: Latent variable graphical model selection via convex optimization
1. Introduction. We thank all the discussants for their careful reading of our paper, and for their insightful critiques. We would also like to thank the editors for organizing this discussion. Our paper contributes to the area of high-dimensional statistics, which has received much attention over the past several years across the statistics, machine learning and signal processing communities. I...
Sufficient Dimension Reduction and Modeling Responses Conditioned on Covariates: An Integrated Approach via Convex Optimization
Given observations of a collection of covariates and responses (Y, X) ∈ R^p × R^q, sufficient dimension reduction (SDR) techniques aim to identify a mapping f : R^q → R^k with k ≪ q such that Y | f(X) is independent of X. The image f(X) summarizes the relevant information in a potentially large number of covariates X that influence the responses Y. In many contemporary settings, the number of responses p ...
Convex optimization methods for graphs and statistical modeling
An outstanding challenge in many problems throughout science and engineering is to succinctly characterize the relationships among a large number of interacting entities. Models based on graphs form one major thrust in this thesis, as graphs often provide a concise representation of the interactions among a large set of variables. A second major emphasis of this thesis is classes of structured...